
Mapping the genetic architecture of gene expression in human liver

Author: Avila-Campillo I
Chudin E
Derry J
Drake TA
Guengerich FP
GuhaThakurta D
Hao K
Johnson JM
Kasarskis A
Kruger MJ
Lamb J
Lum PY
Lusis AJ
Mehrabian M
Millstein J
Molony C
Rohl CA
Rushmore TH
Schadt EE
Schuetz E
Sieberts S
Smith RC
Storey JD
Strom SC
Suver C
Ulrich R
Van Nas A
Wang S
Yang X
Zhang B
Zhu J
Publication venue: Public Library of Science (PLoS)
Publication date: 01/05/2008

Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
Using an integrative genomics approach, we show that our data support RPS26, rather than ERBB3, as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. In the process, we also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels. © 2008 Schadt et al.
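
The SNP-to-expression associations described in this abstract can be illustrated with a minimal sketch: a least-squares fit of one expression trait on allele dosage (0, 1, or 2 copies of an allele). This is only an assumption about the simplest form such a test could take, not the study's actual statistical pipeline; the function name and the data below are hypothetical.

```python
from statistics import mean

def eqtl_slope_and_r(genotypes, expression):
    """Least-squares slope and Pearson r of expression on allele dosage.

    Hypothetical helper for illustration; not taken from the paper.
    """
    g_mean, e_mean = mean(genotypes), mean(expression)
    sxy = sum((g - g_mean) * (e - e_mean) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((e - e_mean) ** 2 for e in expression)
    slope = sxy / sxx                 # expression change per extra allele copy
    r = sxy / (sxx * syy) ** 0.5      # strength of the linear association
    return slope, r

# Made-up data: six samples, dosage of one SNP and one transcript's level.
dosage = [0, 0, 1, 1, 2, 2]
expr = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
slope, r = eqtl_slope_and_r(dosage, expr)
```

A genome-wide scan would repeat a test of this kind for many SNP and transcript pairs and would need a correction for multiple testing, which this sketch omits.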

D-Scholarship@Pitt

Modeling Disordered Regions in Proteins Using Rosetta

Author: A Leaver-Fay
A Zemla
AK Dunker
CA Rohl
CJ Oldfield
David Baker
E Alm
HJ Dyson
JJ Ward
Kristina Krassovsky
Michael Tyka
MV Berjanskii
Ray Yu-Ruei Wang
Vladimir N. Uversky
William Sheffler
Y Shen
Yan Han
Publication venue: Public Library of Science
Publication date: 29/07/2011

Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than those of unfolded states, are not too different from one another, and hence can, to a first approximation, be neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions, which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions, either from sequence alone using, for example, the DISOPRED server, or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.
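
The two-state rule in this abstract (a residue is disordered when its entropy gain outweighs the favorable interactions it loses) can be sketched as follows. This is a simplified illustration, not the Rosetta free energy function: it assumes each residue's interaction energy is fixed and independent of its neighbors' states, whereas a real model would couple residues. The function name, parameters, and numbers are hypothetical.

```python
def assign_disorder(interaction_energies, entropy_gain, temperature=1.0):
    """Label residues under a toy, residue-independent two-state model.

    interaction_energies: per-residue energy of contacts with the rest of
        the chain (negative means favorable); hypothetical inputs.
    entropy_gain: entropy gained by a residue when it becomes disordered.
    """
    labels = []
    total_free_energy = 0.0
    for energy in interaction_energies:
        f_ordered = energy                         # keeps contacts, no entropy gain
        f_disordered = -temperature * entropy_gain  # loses contacts, gains entropy
        if f_disordered < f_ordered:
            labels.append("disordered")
            total_free_energy += f_disordered
        else:
            labels.append("ordered")
            total_free_energy += f_ordered
    return labels, total_free_energy

# Made-up energies: two weakly interacting residues in the middle.
energies = [-3.0, -2.5, -0.2, -0.1, -2.8]
labels, free_energy = assign_disorder(energies, entropy_gain=1.0)
```

With these made-up values, the two weakly interacting residues are the ones labeled disordered, because their contact energy is smaller in magnitude than the entropy term.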

Anchored Design of Protein-Protein Interfaces

Author: A Gulyani
A Koide
A Leaver-Fay
A Skerra
AA Bogan
AA Canutescu
AE Schmidt
B Kuhlman
Brian A. Kuhlman
BS Chevalier
C Wang
CA Rohl
DJ Mandell
DW Sammond
E Karatan
EA Coutsias
G Song
HM Berman
J Janin
J Karanicolas
JB Siegel
JJ Gray
JM Shifman
JN Haidar
JR Wallen
KA Reynolds
KW Kaufmann
L Lo Conte
LMC Meireles
MD Daily
PM Murphy
PS Huang
RK Jha
RL Stanfield
S Liu
SJ Fleishman
SM Lippow
Steven M. Lewis
T Clackson
T Kortemme
V Batori
V Potapov
Vladimir N. Uversky
W Kabsch
X Hu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Few existing protein-protein interface design methods allow for extensive backbone rearrangements during the design process. There is also a dichotomy between redesign methods, which take advantage of the native interface, and de novo methods, which produce novel binders. Here, we propose a new method for designing novel protein reagents that combines advantages of redesign and de novo methods and allows for extensive backbone motion. This method requires a bound structure of a target and one of its natural binding partners. A key interaction in this interface, the anchor, is computationally grafted out of the partner and into a surface loop on the design scaffold. The design scaffold's surface is then redesigned with backbone flexibility to create a new binding partner for the target. Careful choice of a scaffold will bring experimentally desirable characteristics into the new complex. The use of an anchor both expedites the design process and ensures that binding proceeds against a known location on the target. The use of surface loops on the scaffold allows for flexible-backbone redesign to properly search conformational space. This protocol was implemented within the Rosetta3 software suite. To demonstrate and evaluate this protocol, we have developed a benchmarking set of structures from the PDB with loop-mediated interfaces. This protocol can recover the correct loop-mediated interface in 15 out of 16 tested structures, using only a single residue as an anchor.

CiteSeerX

Carolina Digital Repository

Adaptive evolution of the vertebrate skeletal muscle sodium channel

Author: Chiyu Zhang
Felsenstein J
Geffeney S
Geffeney SL
Goldin AL
Jian Lu
Jianzhou Zheng
Keping Chen
Lipkind GM
Lopreato GF
Marban E
Miyazawa K
Mosher HS
Narahashi T
Novak AE
Qinggang Xu
Rohl CA
Soong TW
Stuhmer W
Tamura K
Thompson JD
Venkatesh B
Yang Z
Yokoo A
Yu FH
Zhang J
Publication venue: Sociedade Brasileira de Genética
Publication date: 01/01/2011

Tetrodotoxin (TTX) is a highly potent neurotoxin that blocks the action potential by selectively binding to voltage-gated sodium channels (Nav). The skeletal muscle Nav (Nav1.4) channels in most pufferfish species and certain North American garter snakes are resistant to TTX, whereas in most mammals they are TTX-sensitive. It remains unclear whether the difference in this sensitivity among the various vertebrate species can be associated with adaptive evolution. In this study, we investigated the adaptive evolution of the vertebrate Nav1.4 channels. By means of the CODEML program of the PAML 4.3 package, the lineages of both garter snakes and pufferfishes were identified as being under positive selection. The positively selected sites identified in the p-loop regions indicated their involvement in Nav1.4 channel sensitivity to TTX. Most of these sites were located in the intracellular regions of the Nav1.4 channel, thereby implying the possible association of these regions with the regulation of voltage-sensor movement.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RosettaRemodel: A Generalized Framework for Flexible Backbone Protein Design

Author: A Robbins
AA Canutescu
AM Slovic
AM Wollacott
B Kuhlman
B Qian
BE Correia
BJ Smagghe
C Micklatcher
C Wang
CA Rohl
D Rothlisberger
David Baker
DJ Mandell
E Yosef
F Richter
Florian Richter
FV Cochran
G Dantas
G Grigoryan
G Guntas
I Andre
I Goreshnik
Ingemar Andre
J Ashworth
J Swift
JB Siegel
JJ Havranek
JR Calhoun
L Jiang
PM Murphy
Po-Ssu Huang
PS Huang
PS Shah
R Das
RK Jha
Robert Vernon
S Cooper
SB Thyme
SJ Fleishman
SM Malakauskas
T Van Montfort
TP Treynor
UJ Shukla
Vladimir N. Uversky
William R. Schief
XZ Hu
Yih-En Andrew Ban
Publication venue: Public Library of Science
Publication date: 01/01/2011

We describe RosettaRemodel, a generalized framework for flexible protein design that provides a versatile and convenient interface to the Rosetta modeling suite. RosettaRemodel employs a unified interface, called a blueprint, which allows detailed control over many aspects of flexible backbone protein design calculations. RosettaRemodel allows the construction and elaboration of customized protocols for a wide range of design problems, from loop insertion and deletion, disulfide engineering, domain assembly, loop remodeling, motif grafting, and symmetrical units to de novo structure modeling.

CiteSeerX

Lund University Publications

Predicting the Tolerated Sequences for Proteins and Protein Interfaces Using RosettaBackrub Flexible Backbone Design

Author: A Ernst
A Leaver-Fay
AE Sauer-Eriksson
B Kuhlman
CA Rohl
CA Smith
CA Voigt
Colin A. Smith
CT Saunders
DJ Mandell
DM Fowler
EL Humphris
F Ding
G Fuh
G Pál
GD Friedland
GP Smith
HL Schmidt
I Georgiev
IW Davis
JD Bloom
JD Kotz
JJ Havranek
JR Desjarlais
KM Frey
MD Distefano
N Metropolis
N Ollikainen
N Pokala
NJ Marini
PB Harbury
R Tonikian
RL Dunbrack
RP Laura
SM Larson
T Clackson
T Kortemme
Tanja Kortemme
TP Treynor
Vladimir N. Uversky
X Fu
X Hu
XI Ambroggio
Publication venue: Public Library of Science
Publication date: 18/07/2011

Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts, and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.
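
The merging step mentioned in this abstract, where results from each backbone are combined into one estimate of tolerated sequence space, might look roughly like the sketch below. It pools the low-energy sequences from every ensemble member with equal weight and reports per-position amino acid frequencies. The equal weighting, the function name, and the toy sequences are assumptions made for illustration; the abstract does not specify how the published protocol weights or merges results.

```python
from collections import Counter

def position_frequencies(sequences_per_backbone):
    """Pool sequences from all backbones into per-position frequencies.

    sequences_per_backbone: one list of equal-length sequences per ensemble
        member (hypothetical input format, unweighted pooling assumed).
    """
    pooled = [seq for seqs in sequences_per_backbone for seq in seqs]
    length = len(pooled[0])
    if any(len(seq) != length for seq in pooled):
        raise ValueError("all sequences must have the same length")
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in pooled)
        profile.append({aa: n / len(pooled) for aa, n in counts.items()})
    return profile

# Made-up three-residue sequences from two backbone structures.
ensemble = [["ACD", "ACE"], ["GCD", "ACD"]]
profile = position_frequencies(ensemble)
```

Each entry of `profile` maps the amino acids seen at that position to their pooled frequency, so a position with a single entry is fully conserved in the toy data.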

The eNMR platform for structural biology

Author: A Mittermaier
A Rosato
Alexandre M. J. J. Bonvin
Antonio Rosato
C Dominguez
CA Rohl
CD Schwieters
G Pintacuda
HM Berman
I Bertini
J Janin
J Moult
J Wöhnert
K Wüthrich
L Banci
M Billeter
MF Lensink
P Guntert
P Güntert
SJ Vries de
T Herrmann
T Ikegami
Tsjerk A. Wassenaar
WF Vranken
Y Shen
Publication venue: Springer Netherlands
Publication date: 01/01/2010

The e-NMR project is a European cooperation initiative that aims at providing the bio-NMR user community with a software platform integrating and streamlining the computational approaches necessary for the analysis of bio-NMR data. The e-NMR platform is based on a Grid computational infrastructure. A main focus of the current implementation of the e-NMR platform is on streamlining structure determination protocols. Indeed, to facilitate the use of NMR spectroscopy in the life sciences, the eNMR consortium has set out to provide protocolized services through easy-to-use web interfaces, while still retaining sufficient flexibility to handle specific requests by expert users. Various programs relevant for structural biology applications are already available through the e-NMR portal, including HADDOCK, XPLOR-NIH, CYANA and csRosetta. The implementation of these services, and in particular the distribution of calculations to the GRID infrastructure, has required the development of specific tools. However, the GRID infrastructure is kept completely transparent to the users. With more than 150 registered users, eNMR is currently the second largest European Virtual Organization in the life sciences.

Springer - Publisher Connector

Florence Research

Utrecht University Repository

Rational Design of Temperature-Sensitive Alleles Using Computational Structure Prediction

Author: B Cunningham
B Lee
C Cortes
CA Rohl
Christopher S. Poultney
CJ Burges
David Gresham
Dennis E. Shasha
EH Kellogg
G Chakshusmathi
Glenn L. Butterfoss
HM Muller
JM Word
JR Quinlan
K Bajaj
K Drew
KD Pruitt
Kevin Drew
Kristin C. Gunsalus
M Hall
Michelle R. Gutwein
N Eswar
N Siew
R Varadarajan
Richard Bonneau
RJ Dohmen
S Tweedie
SF Altschul
TW Harris
Vladimir N. Uversky
WS Noble
WS Sandberg
Publication venue: Public Library of Science
Publication date: 02/09/2011

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate “top 5” list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions.

arXiv.org e-Print Archive

Atomic-accuracy prediction of protein loop structures through an RNA-inspired ansatz

Author: A Fiser
A Hildebrand
A Leaver-Fay
AA Canutescu
AC Martin
AM Buckle
B Kuhlman
BD Sellers
C Levinthal
C Wang
CA McPhalen
CA Rohl
Charlotte M. Deane
D Xu
DE Kim
DJ Mandell
F DiMaio
FC Chou
GF Schroder
GW Harris
IK McDonald
J Carlsson
J Desmet
J Hockenmaier
JA Cruz
JC Grigg
JJ Gray
JM Word
K Zhu
KD Gibson
L Kinch
LL Videau
MH Chu
N Eswar
N Ollikainen
NB Hammond
NJ Baird
P Sripakdeevong
P Vallurupalli
PB Harbury
R Das
R Kratzner
R Savva
Rhiju Das
S Raman
S Vajda
SB Ozkan
SJ Chen
SJ Fleishman
SR Eddy
T Kortemme
T Lazaridis
T Schwede
Y Urakubo
Publication venue: Public Library of Science (PLoS)
Publication date: 24/05/2013

Consistently predicting biopolymer structure at atomic resolution from sequence alone remains a difficult problem, even for small sub-segments of large proteins. Such loop prediction challenges, which arise frequently in comparative modeling and protein design, can become intractable as loop lengths exceed 10 residues and if surrounding side-chain conformations are erased. This article introduces a modeling strategy based on a 'stepwise ansatz', recently developed for RNA modeling, which posits that any realistic all-atom molecular conformation can be built up by residue-by-residue stepwise enumeration. When harnessed to a dynamic-programming-like recursion in the Rosetta framework, the resulting stepwise assembly (SWA) protocol enables enumerative sampling of a 12 residue loop at a significant but achievable cost of thousands of CPU-hours. In a previously established benchmark, SWA recovers crystallographic conformations with sub-Angstrom accuracy for 19 of 20 loops, compared to 14 of 20 by KIC modeling with a comparable expenditure of computational power. Furthermore, SWA gives high accuracy results on an additional set of 15 loops highlighted in the biological literature for their irregularity or unusual length. Successes include cis-Pro touch turns, loops that pass through tunnels of other side-chains, and loops of lengths up to 24 residues. Remaining problem cases are traced to inaccuracies in the Rosetta all-atom energy function. In five additional blind tests, SWA achieves sub-Angstrom accuracy models, including the first such success in a protein/RNA binding interface, the YbxF/kink-turn interaction in the fourth RNA-puzzle competition. 
These results establish all-atom enumeration as a systematic approach to protein structure that can leverage high performance computing and physically realistic energy functions to more consistently achieve atomic resolution. Comment: Identity of four-loop blind test protein and parts of figures 5 have been omitted in this preprint to ensure confidentiality of the protein structure prior to its public release.
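
The idea of building a conformation residue by residue inside a dynamic-programming-like recursion can be illustrated with a toy analog. The sketch below is not the SWA protocol: it assumes each residue takes one of a few discrete states and that the energy is a sum of per-residue terms and nearest-neighbor terms, which makes an exact dynamic-programming recursion possible. All names, states, and energies are hypothetical.

```python
def lowest_energy_path(n_residues, states, local_energy, pair_energy):
    """Exact minimum-energy state assignment for a toy chain.

    best[s] holds (energy, path) for the best partial chain ending in
    state s; each step extends every partial chain by one residue.
    """
    best = {s: (local_energy(0, s), [s]) for s in states}
    for i in range(1, n_residues):
        new_best = {}
        for s in states:
            # Pick the best predecessor state for a chain now ending in s.
            prev_energy, prev_path = min(
                ((best[p][0] + pair_energy(p, s), best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prev_energy + local_energy(i, s), prev_path + [s])
        best = new_best
    return min(best.values(), key=lambda item: item[0])

# Toy energies: residue 2 strongly prefers "beta"; switching states costs 1.0.
def local_energy(i, s):
    if i == 2:
        return -3.0 if s == "beta" else 0.0
    return 0.4 if s == "beta" else 0.0

def pair_energy(prev_state, state):
    return 0.0 if prev_state == state else 1.0

energy, path = lowest_energy_path(4, ["alpha", "beta"], local_energy, pair_energy)
```

In this toy case the recursion keeps only one partial chain per end state, so the cost grows linearly with chain length. All-atom loop modeling has no such simple decomposition, which is consistent with the thousands of CPU-hours the abstract reports.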